Predicting RNA distance-based contact maps by integrated deep learning on physics-inferred secondary structure and evolutionary-derived mutational coupling

Abstract Motivation Recently, AlphaFold2 achieved high experimental accuracy for the majority of proteins in Critical Assessment of Structure Prediction (CASP 14). This raises the hope that one day, we may achieve the same feat for RNA structure prediction for those structured RNAs, which is as fundamentally and practically important similar to protein structure prediction. One major factor in the recent advancement of protein structure prediction is the highly accurate prediction of distance-based contact maps of proteins. Results Here, we showed that by integrated deep learning with physics-inferred secondary structures, co-evolutionary information and multiple sequence-alignment sampling, we can achieve RNA contact-map prediction at a level of accuracy similar to that in protein contact-map prediction. More importantly, highly accurate prediction for top L long-range contacts can be assured for those RNAs with a high effective number of homologous sequences (Neff > 50). The initial use of the predicted contact map as distance-based restraints confirmed its usefulness in 3D structure prediction. Availability and implementation SPOT-RNA-2D is available as a web server at https://sparks-lab.org/server/spot-rna-2d/ and as a standalone program at https://github.com/jaswindersingh2/SPOT-RNA-2D. Supplementary information Supplementary data are available at Bioinformatics online.


Introduction
3D structures of non-coding RNAs are the key for understanding their biological functions. Although these structures can be obtained by X-ray crystallography, nuclear magnetic resonance (NMR) or cryogenic electron microscopy, the low throughput and costly nature of these experimental techniques and the challenges associated with RNA structure determinations has led to fewer than 3% deposited structures in protein data bank (Rose et al., 2017) containing RNAs and fewer than 0.01% of 18 million non-coding RNAs collected in RNAcentral (Consortium, 2020) with known structures. As a result, many computational and experimental methods have been developed for predicting or probing RNA secondary structure and base contacts (Cai et al., 2020;Janssen and Giegerich, 2015;Lorenz et al., 2011;Luo et al., 2021;Reuter and Mathews, 2010;Solayman et al., 2022) along with 1D to multi-dimensional structural probing data (Carlson et al., 2018;Kubota et al., 2015;Tinoco and Bustamante, 1999) to act as restraints for RNA 3D structure prediction (Watkins et al., 2020;Zhang et al., 2020bZhang et al., , 2020c. Recently, computational prediction of RNA 3D structures has been improved by restraining distance-based contacts from direct coupling analysis analysis (DCA) (De Leonardis et al., 2015a;Wang et al., 2017a;Weinreb et al., 2016). DCA restraints produced by these methods, however, were limited to <4000 families in the RFAM (Kalvari et al., 2018) database, which provides multiplesequence-alignment (MSA) manually curated according to experimentally determined secondary structures. Although this limitation was removed by the development of a pipeline named RNAcmap (Zhang et al., 2021) that searches for and aligns with co-variant homologous sequences automatically, the accuracy of DCA predictors remains low, especially for those RNAs with few homologous sequences (Pucci et al., 2020).
In the past few years, predicting distance-based contact maps of proteins has been improved significantly over DCA predictors by utilizing an ensemble of deep neural networks with improved evolutionary profiles and DCA results as an input (Hanson et al., 2018;Li et al., 2021;Wang et al., 2017b;Yang et al., 2020). Inspired by these studies, we developed a 2D deep-learning-based predictor SPOT-RNA (Singh et al., 2019(Singh et al., , 2021a for secondary and tertiary base pairs and Sun et al. established the method for predicting distance-based contact maps [RNAContact (Sun et al., 2021)]. RNAContact used evolutionary information; however, it neither took advantage of correlated mutations for deeper homology search nor used correlated mutations as input. Moreover, the method seems to perform poorly for the RNAs with few homologous sequences, which are the case for the majority of RNAs.
In this work, we improved the accuracy of predicting RNA distance-based contact map with improved co-evolutionary information from RNAcmap (Zhang et al., 2021) and multiple sequence alignment sampling techniques used in trRosetta (Yang et al., 2020) and AlphaFold (Senior et al., 2020). Moreover, we used predicted secondary structures from a single-sequence-folding-based technique RNAfold as an input to improve the baseline performance for those RNAs without homologs. The new method, called SPOT-RNA-2D, substantially improves over existing methods in distance-based contact prediction, even at the single-sequence level.

Datasets
We used the same benchmark datasets from our previous work on RNA backbone angle prediction (Singh et al., 2021b) (SPOT-RNA-1D) for training, validation and testing. The SPOT-RNA-1D datasets consist of a training set (TR) of 286 RNA chains, a validation set (VL) of 30 RNA chains, and three test sets (TS1, TS2, TS3) of 63, 30 and 54 RNA chains, respectively, all prepared by using highresolution RNA structures from PDB (Rose et al., 2017).
More specifically, we downloaded all the high-resolution (<3.5 Å ) X-ray structures from PDB on 3 October 2020. These RNA structures were split into individual chains using a PDBParser from Biopython (Cock et al., 2009) and then sequences clustered using CD-HIT-EST (Fu et al., 2012) at the lowest allowed identity cut-off of 80%. Non-clustered RNAs were kept for the validation and test sets while the remaining clustered RNAs for the training set.
As the 80% sequence identity cut-off may not be strict enough, therefore, we further used the BLAST-N (Altschul et al., 1997) tool on non-clustered RNAs against training set and within themselves with a large e-value cut-off of 10. Any non-clustered sequence hits with the training set were removed from the training set, and any non-clustered sequence hits within themselves were also removed. After the CD-HIT-EST and BLAST-N filtering on non-clustered RNAs chains, we randomly split these RNAs chains into one validation (VL) and two test sets (TS1 and TS2).
Furthermore, we made VL and TS2 non-redundant even at the remote-homolog level from the other datasets (TR, TS1) and within themselves (VL, TS2). To achieve non-redundancy at the remotehomolog level, we build a covariance model for RNAs in VL and TS2. The covariance was built by first searching the query RNA (from VL, TS2) against NCBI's database for homologs using BLAST-N. Then the multiple sequence alignment (MSA) of homologs along with consensus secondary structure (CSS) from 3D structural files was used to build the covariance model using the cmbuild program from the INFERNAL (Nawrocki and Eddy, 2013) tool. Finally, the covariance model of VL and TS2 RNAs was searched against the TR and TS1 using the cmsearch program from INFERNAL with an E-value cut-off of 0.1 for VL and 10 for TS2. Any hits of covariance models were removed from TR and TS1. Similarly, VL and TS2 were made non-redundant at the remotehomolog level within themselves. We used an E-value cut-off of 0.1 for VL to maintain a reasonable number of RNAs in the validation set (VL) and 10 for TS2 to make this test set as strict as possible for benchmarking. Tertiary structure predictions were evaluated on the TS2 set as it represents the most challenging set of targets for SPOT-RNA-2D.
The final X-ray dataset consists of 286, 30, 63 and 30 RNA  chains for TR, VL, TS1 and TS2, respectively. The distribution of the datasets, such as the number of RNA chains in each set, median and maximum sequence lengths, type of base-pairs, the average number of distance-based contacts, is shown in Supplementary  Table S1. We used the DSSR (Lu et al., 2015) tool to extract different base pairs from 3D structural files.
These X-ray datasets were mapped to Rfam families (using the https://rfam.xfam.org/ website), to analyze the distribution and overlap of datasets in terms of Rfam families. As shown in Supplementary Table S2, few Rfam families are over-represented compared to others. This imbalance reflects the distribution of RNA 3D structures submitted to PDB. The utilization of CD-HIT-EST and BLAST-N to prepare test set TS1 remove Rfam family overlap of nearly 63% RNAs in comparison to training data as shown in Supplementary Table S2. The criterion of CD-HIT-EST and BLAST-N to prepare test sets has been used in past by RNAsol (Sun et al., 2019), RNAsnap2 (Hanumanthappa et al., 2021) and RNAContact (Sun et al., 2021) methods. In addition, we further utilized INFERNAL to prepare datasets VL and TS2 by removing Rfam families overlap with training data nearly 100% except 1 RNA (2qus_A) from TS2. Moreover, we prepared another test set (TSrfam) which consists of 69 RNAs from TS1 and TS2 that do not share any Rfam families overlap with training data. Performance on the test set TS-rfam is mentioned separately in Section 3.
During the preparation of the above datasets, we purposely kept RNA chains belong to the RNA-Puzzles (Cruz et al., 2012;Miao et al., 2015Miao et al., , 2017Miao et al., , 2020 in the test sets as many as possible because RNA-Puzzles RNAs are widely used for benchmarking RNA 3D models. There are 12 RNA chains belong to RNA-Puzzles, which was listed as a separate test set (named RNA-Puzzles) for benchmarking the predictors.
In addition to the above three test sets (TS1, TS2 and RNA-Puzzles), we prepared an additional test set (TS3) using NMR structures. To prepare TS3, we downloaded all the NMR structures (707) from the PDB (Rose et al., 2017) on 5 April 2021. We made these RNAs non-redundant within themselves and from TR, VL, TS1 and TS2 using the exact same criterion as TS1 and obtained 54 RNA chains for benchmarking.
To obtain the distance-based contact labels, we defined two nucleotides in contact if the distance between the nearest-heavy atoms of these two nucleotides is <8 Å as done previously (De Leonardis et al., 2015a;Weinreb et al., 2016;Zhang et al., 2021). Also, the Number of EFFective (N eff ) for all the RNAs were obtained using GREMLIN tool, where N eff is defined as the sum of weights after down-weighting each sequence by the number of neighbours above a pairwise sequence similarity cutoff of 0.8.

Input features
The input features for this work follow our previous base-pair predictor [SPOT-RNA2 (Singh et al., 2021a)], which illustrated the importance of both single-sequence-based and evolutionary-profilebased features. Both types of features were used here for RNA distance-based contact map prediction, as shown in Figure 1A.
Single-sequence-based features include one-hot encoding of an input RNA sequence of size L Â 4, with ð1; 0; 0; 0Þ; ð0; 1; 0; 0Þ; ð0; 0; 1; 0Þ and ð0; 0; 0; 1Þ for four nucleotides (A, U, G and C), respectively, where L is the sequence length. Along with one-hot encoding, we used the predicted base-pair probability of size L Â L from the single-sequencebased method RNAfold (Lorenz et al., 2011). We preferred RNAfold over other predictors because it is one of the most accurate nonmachine learning methods. We used a non-machine learning-based method to avoid any potential bias during performance evaluation on test sets.
All 1D features (one-hot encoding and PSSM, L Â 4) were converted into 2D features of size L Â 16 using the outer-concatenation function as described in RaptorX-Contact (Wang et al., 2017b). Finally, 1D and 2D features were concatenated into a feature vector of size L Â L Â 18 as an input to the deep neural network model, as shown in Figure 1B. These input features were standardized to have a zero mean with a unity variance according to the mean and the standard deviation in the training set before feeding into the deep neural network architecture.

Neural network architecture
The deep neural network architecture used in this work was inspired by our previous work SPOT-RNA (Singh et al., 2019) and SPOT-RNA2 (Singh et al., 2021a) for RNA secondary structure prediction. SPOT-RNA and SPOT-RNA2 utilized an ensemble of Residual Convolutional Neural Networks (He et al., 2016) (ResNets). Similar to SPOT-RNAs, we used an ensemble of ResNets based on the generalized model architecture shown in Figure 1B.
The architecture of SPOT-RNA-2D (shown in Fig. 1B) consists of an initial convolutional layer with a kernel size of 3 Â 3 and N F filters followed by N A ResNet blocks (Block-A). A single ResNet block consists of two pre-activated convolutional layers with a kernel size of 3 Â 3 and N F filters. The input to convolutional layers was also normalized by using layer normalization (Ba et al., 2016). A dropout rate of 50% was used to avoid overfitting of model weights on training data.
Finally, a fully connected (FC) output layer with a single node and a sigmoid activation function was used. It predicts the upper triangular base-pair probability matrix of size L Â L (as shown in Fig. 1B), where L is the length of the input sequence. A single threshold value was used to decide whether two nucleotides are in a distance-based contact or not. The value of the threshold was obtained by optimizing the F1-score on the validation set (VL).
We trained many models based on the architecture shown in Figure 1B by grid search for range of ResNets blocks (N A ) and filters (N F ). N A was varied from 1 to 20 and N F from 32 to 96 with a step increase of 1 and 8, respectively. Based on the performance of the validation set, we choose the four best models for the ensemble. Model architecture parameters for these four models are shown in Supplementary Table S3.
All the models were implemented using Google's TensorFlow framework (Version-1.15) and trained using Nvidia GTX TITAN X graphics processing unit (GPU). For training, an Adam optimizer (Kingma and Ba, 2014) with a learning rate of 0.005 was used. A ELU function (Clevert et al., 2015) was used to apply non-linear activation to the output of every layer except the output FC layer. Model hyper-parameters such as optimizers, activation functions and dropout rates were obtained from our previous work of RNA secondary structure prediction (Singh et al., 2019(Singh et al., , 2021a).

Tertiary structure modelling
RNA 3D structure models were generated with the rna_denovo application of the Rosetta molecular modelling suite. Inputs were generated from the FARFAR2 (Watkins et al., 2020) (Lu et al., 2015)] were used to pre-assemble canonical A-form helices. Pseudoknotted base pairs were removed with the RemovePseudoknots application from the RNAstructure package (Reuter and Mathews, 2010). Predicted tertiary contacts were then identified based on previously defined criteria (De Leonardis et al., 2015b), which excludes local sequence neighbours (ji À jj < 5) and contacts neighbouring assigned secondary structure pairs (k6f0; 1; 2g; l6f0; 1; 2g). Constraints were selected based on a fixed probability threshold (0.3 and 0.4 for SPOT-RNA-2D and SPOT-RNA-2D-Single). The probability threshold was optimized by a coarse grid search ( Supplementary Fig. S1) and the improvement of SPOT-RNA-2D is reasonably robust to the specific threshold. Ambiguous atom-pair constraints were applied to all combinations of heavy atoms such that only the minimum-energy atom-pair impacted the global energy during the minimization. Constraint energy terms used the long-range definition defined previously (De Leonardis et al., 2015b) with an empirical weight of 10 (FADE -100 26 20 -20 20). Models with tertiary constraints were generated by fragment assembly since the full atom refinement was not found to improve model accuracy when using tertiary constraints (not shown). Baseline models were generated with the full FARFAR2 protocol. In this work we report model accuracy as the C3' RMSD of the native structure with the top 1 low energy model from 200 simulations.

Performance evaluation
Similar to RNA secondary structure prediction, distance-based contact-map prediction is a binary classification problem. Thus, we used as a performance measure, where TP, FP and FN denotes true positives, false positives and false negatives, respectively, and i and j are the sequence positions of any two nucleotides in a sequence.
To evaluate binary classification performance metrics, the predicted base-pair probability was converted into binary classification using a threshold value that optimize the F1-score on the validation set for SPOT-RNA-2D (0.46) and SPOT-RNA-2D-Single (0.50). For other methods, we used a threshold value that optimizes the F1 score on the test set TS1. In addition, for an RNA of length L, we used a mean precision of top L=1; L=2; L=5 and L=10 long-range contacts [ji À jj ! 24] for benchmarking different predictors, where i and j are the of sequence positions of any two nucleotides in a sequence. This definition of long-range contacts [ji À jj ! 24] has been utilized in past by DIRECT (Jian et al., 2019) and RNAContact (Sun et al., 2021). Tertiary structure models were evaluated by superimposed C3' RMSD with the native structure computed with Biopython (Cock et al., 2009).

Feature contribution, MSA sampling and ensemble learning
SPOT-RNA-2D utilizes two single-sequence-based and two evolutionary-based input features for RNA distance-based contact map prediction, as shown in Figure 1A. Single-sequence-based features include one-hot encoding (1D) and predicted base-pair probability (2D) from RNAfold (Lorenz et al., 2011) (MFE). Evolutionary-based features include position-specific scoring matrix (PSSM, 1D) and direct coupling analysis (Balakrishnan et al., 2011;Hopf et al., 2017) (DCA, 2D) from RNAcmap generated multiplesequence-alignment (MSA-2). Table 1 shows the effect of the different input features on distance contact prediction used by a baseline model (Model-0) with the architecture shown in Figure 1B. This model was trained using a training set TR, validated by a validation set VL and tested on three non-redundant test sets TS1, TS2 and TS-rfam, all from highresolution X-ray structures (Singh et al., 2021b). The test set TS1 is made non-redundant from TR, VL, TS2 and within itself using CD-HIT-EST (Fu et al., 2012) (lowest identity cutoff of 0.8) followed by BLAST-N (Altschul et al., 1997) (E-value ¼ 10). The test set TS2 (a harder test set) further excludes remote structural homologs by searching the INFERNAL covariance model of TS2 sequences against TR, VL, TS2 and within itself. The test set TS-rfam was prepared from TS1 and TS2 by extracting 69 RNAs that do not overlap with any Rfam family in the training data.
As shown in Table 1 Table S1), indicates the robustness of the training.
To further extract the evolution information, we performed MSA sampling during training that was found effective for protein structure prediction by Alphafold (Senior et al., 2020) and trRosetta (Yang et al., 2020). This was done by using 20 random samples of 50, 100, 200, 500, 1000 RNAs from the full MSA alignment (MSA-2 in Fig. 1A). We chose 20 random seeds because SPOT-RNA-2D models took about 20 epochs to converge. PSSM and DCA features were evaluated for these sampled MSA before training. As shown in Table 1, MSA sampling improved the performance of the baseline model for validation (VL) and two test sets (TS1 and TS2). The amount of performance improvement observed was most for the TS1 because of relatively higher N eff -value (median N eff ¼ 40) in TS1 than in VL (median N eff ¼2) and TS2 (median N eff ¼9).
Finally, we used ensemble learning by performing the average ensemble of the outputs of the best four models from many models trained using the generalized architecture shown in Figure 1B. As shown in Table 1, ensemble learning further improves for the performance for validation (VL), and three test sets (TS1, TS2 and TS-rfam). Supplementary Table S4 shows Rfam family-wise performance of SPOT-RNA-2D on 93 RNAs from TS1 and TS2 with families overlap with training data highlighted in color red. There is no obvious trend of higher performance on families that exists in training in comparison to families that do not exist in training data. This shows the robustness of trained model to generalize across Rfam families that are not in training data.

Comparison with other predictors
Two methods SPOT-RNA-2D and SPOT-RNA-2D-Single were developed with and without evolution information, respectively. They were compared to four existing direct coupling analysis (DCA) predictors (PLMC, mfDCA, plmDCA and GREMLIN) on two highresolution (<3.5 Å ) non-redundant test sets (TS1 and TS2) derived from PDB X-ray structures and one non-redundant test set (TS3) derived from PDB NMR structures. Another test set TS-rfam prepared from TS1 and TS2 by extracting 69 RNAs without any Rfam family overlap with training set (TR). Test sets TS1 and TS2 contain 12 RNA-Puzzles RNAs with results also shown separately. To further compare with the recently developed deep learning-based RNAContact (Sun et al., 2021) without biases, we removed redundant sequences in TS1, TS2, TS3, RNA-Puzzles and TS-rfam to the RNAContact training set and led to 21, 9, 52, 7 and 20 RNAs, respectively. We also downloaded the test set TS80 prepared by RNAContact. After removing the sequence identity of TS80 with our training data (TR) using the same criterion as RNAContact, we obtained 10 RNAs in TS80. We compared to DCA predictors on full test sets and to RNAContact on reduced test sets. Table 2 shows that RNAContact, SPOT-RNA-2D and SPOT-RNA-2D-Single (F1 > 0.58) are all substantially better than DCA predictors (F1 $ 0.3) for all test sets. SPOT-RNA-2D improves over RNAContact in F1-score by 18%, 3.5%, 32%, 22%, 16% and 8% on the reduced test sets TS1, TS2, TS3, RNA-Puzzles, TS80 and TSrfam, respectively. In fact, even the single-sequence-based predictor (SPOT-RNA-2D-Single) improves over RNAContact in F1-score by 1-3% better on TS1 and TS2 and 6-30% better on the TS3, RNA-Puzzles, TS80 and TS-rfam.
Overall, test sets TS2 and TS3 are more difficult to predict compared to the other test sets (TS1, RNA-Puzzles and TS80) for all the predictors. This is mainly due to the relatively low N eff -value of TS2 (median N eff ¼9, see Supplementary Table S1) and TS3 (median N eff ¼4, see Supplementary Table S1). Unlike evolutionary-profilebased predictors, SPOT-RNA-2D-Single is more consistent irrespective of N eff -value, showing the usefulness of a single-sequencebased predictor where evolutionary information is not available.
The large improvement of SPOT-RNA-2D over DCA methods can be further illustrated by the precision-recall curve ( Fig. 2A) on combined 147 RNAs from three test sets TS1, TS2 and TS3. Both SPOT-RNA-2D and SPOT-RNA-2D-Single are significantly better than all DCA predictors for any threshold values. In Figure 2B, RNAContact was compared to SPOT-RNA-2D separately on the combined reduced test sets (82 RNAs). SPOT-RNA-2D-Single offers a large improvement over RNAContact with a 12% increase in the area under the precision-recall curve, and SPOT-RNA-2D provides an additional improvement of 11% over SPOT-RNA-2D-Single. Similar large improvements can be illustrated with the receiver operating characteristic curves shown in Supplementary Figure S2A and B.
Another way to measure the method performance in contact prediction is the fraction of true positions in top L, L/2, L/5 and L/10 predictions (Precision). The most important contacts are those contacts with large differences in sequence positions (long-range or non-local contacts). Figure 3A and B compares the precisions in long-range contacts [ði À jÞ ! 24] given by DCA methods and RNAContact, respectively, with those by SPOT-RNA-2D and SPOT-RNA-2D-Single for five test sets (TS1, TS2, TS3, RNA-Puzzles and TS80). Again, a large improvement is observed.
To illustrate the impact of homologous sequences, Figure 4 shows the mean precision of long-range contacts as a function of N eff -value. The mean precisions for SPOT-RNA-2D and DCA predictors (Fig. 4A) increase with N eff -value with a large gap between them. Meanwhile, the performance of SPOT-RNA-2D-Single does not depend much on the N eff -value with similar performance to SPOT-RNA-2D for low N eff RNAs, highlighting the robustness of the method performance. Figure 4B compares the performance of SPOT-RNA-2D and SPOT-RNA-2D-Single to that of RNAContact on the combined reduced test sets (TS1, TS2 and TS3.) The dependence of RNAContact on N eff is surprisingly low, suggesting its ineffective use of evolutionary information. SPOT-RNA-2D yields a larger improvement over RNAContact for high N eff RNAs. Even SPOT-RNA-2D-Single based on the single-sequence only outperforms RNAContact for all values of N eff , except at N eff >100. The similar performance at low N eff between SPOT-RNA-2D and SPOT-RNA-2D-single indicates the effective use of evolutionary information by SPOT-RNA-2D without compromising performance for low N eff -value RNAs.

Base pairs and non-base pairs
Distance-based contacts involve paired bases and bases that are not paired but near each other. Base pairs include canonical, non-canonical, pseudoknots and multiples base pairs, all of which were annotated according to their native 3D structural files using the DSSR (Lu et al., 2015) tool. Non-base pairs include all distance-based contacts, excluding the base pairs. As a reference, we also included representative RNA secondary structure predictors such as RNAfold (Lorenz et al., 2011), LinearPartition (Zhang et al., 2020a), SPOT-RNA (Singh et al., 2019) and SPOT-RNA2 (Singh et al., 2021a). Comparison to SPOT-RNA and SPOT-RNA2 was made on a reduced set after removing redundancy to their training sets. F1-scores as a function of top L/n predictions given by various methods are shown in Supplementary Figure S3 for base pairs and Supplementary Figure S4 for non-base pairs. Results are further summarized in Table 3 by showing the mean highest F1-score given by each predictor in Supplementary Figures S3 or S4. The performance of all DCA predictors is in general poor for both base and non-base pairs. The standard folding-based secondary structure predictors RNAfold and LinearPartition improve over DCA methods for base pairs and comparable for non-base pairs. SPOT-RNA-2D has comparable mean F1-score for base pairs prediction in comparison to secondary structure predictors (RNAfold and LinearPartition), although it was not trained specifically for base pairs. More importantly, SPOT-RNA-2D is substantially more accurate for non-base pairs as non-base pairs comprise $85% of total distance-based contacts.
We also compared SPOT-RNA-2D and SPOT-RNA-2D-Single to our recently developed deep learning-based RNA secondary structure predictors SPOT-RNA and SPOT-RNA2. For a fair comparison, we removed test RNAs that have overlap with SPOT-RNA's training data which reduced TS1 and TS2 to 38 and 16 RNAs, respectively. As shown in Table 3, SPOT-RNA-2D has a comparable mean F1-scores in base pairs and significantly better mean F1-score in non-base pairs when compared to deep-learning-based secondarystructure predictors. SPOT-RNA-2D improved long-range F1-score over RNAContact by 71%, 42% and 97% for base pairs and 34%, 16% and 59% for non-base pairs for test set TS1, TS2 and TS3, respectively. SPOT-RNA-2D-Single also outperforms RNAContact on three test sets (TS1, TS2 and TS3) without using any evolutionary features. Figure 5 illustrates predictions of RNAContact, SPOT-RNA-2D-Single and SPOT-RNA-2D predictions (in the lower triangle) and native contacts (in the upper triangle) for three RNAs from the RNA-Puzzles test set. Native contacts in the upper triangular matrix consist of both base pairs and non-base pairs. Base pairs are shown in the color black and non-base pairs in the color red. Also, long-range stems are highlighted in color orange in predicted contact maps and corresponding actual 3D structures, and remaining stems in color red in their corresponding actual 3D structures. Figure 5A-C shows the prediction of high N eff -value (N eff ¼94) 2'-dG-II riboswitch (Matyjasik and Batey, 2019) (Chain A in PDB ID 6p2h, Puzzle-25) by RNAContact, SPOT-RNA-2D-Single and SPOT-RNA-2D, respectively. SPOT-RNA-2D is able to predict a native-contact like a pattern within top L long-range predicted contacts with a precision of 0.99. By

Tertiary structure modelling
Supplementary Table S5 summarizes the average RMSD of models from TS2 generated by Rosetta with the different input information. The baseline models [FARFAR2 (Watkins et al., 2020)] rely only on a predicted secondary structure to guide minimization and yield an average RMSD of 16.9 Å for the top 1 low energy models. For TS2 it appears that there is not much advantage to using the true secondary structure labels compared with those predicted by RNAfold when few tertiary constraints are available (FARFAR2 and SPOT-RNA-2D-Single). Using the RNAfold predicted secondary structure combined with tertiary contacts predicted by SPOT-RNA-2D improves the mean RMSD to 14.6 Å (16% reduction). Results are much better if RNA complex structures are excluded (25 targets remained) with a 20% reduction from the mean RMSD of 15.5 Å by FARFAR2 to 12.7 Å by SPOT-RNA-2D. We also report the typical time taken to generate a single model by SPOT-RNA-2D compared with FARFAR2 in Supplementary Figure S5. Despite the computational overhead associated with evaluating tertiary constraints, the SPOT-RNA-2D computational time is comparable with FARFAR2 by forgoing the costly full atom refinement stage. Models were generated using a single core Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60 GHz. Models generated with DSSR secondary structure and constraints derived from true contacts in the native structure represent a lower bound on modelling performance within the proposed framework. However, even with perfect contact prediction and a perfect secondary structure, only a mean RMSD of 8.1 Å (6.5 Å excluding complex structures) can be achieved. This suggests that even using perfect contact information is not sufficient to fold with Rosetta.
An example that demonstrates the efficacy of the SPOT-RNA-2D predicted contacts is the 2'-dG riboswitch (Pikovskaya et al., 2011) (Supplementary Fig. S6). Black dots in the upper diagonal of the contact maps represent base-paired nucleotides predicted by RNAfold and derived from DSSR in the native structure. The black color in the lower diagonal represents constraints that are satisfied by the top 1 model and the blue color is used to indicate constraints that were not satisfied by the respective model. While the secondary structure is correctly assigned by RNAfold, relative helix orientations are not recovered by the naïve FARFAR2 protocol which leads to a poor RMSD of the top 1 model with the native structure (17.7 Å ). The SPOT-RNA-2D predicted contact map includes critical tertiary interactions which enforce the correct antiparallel helical configuration when applied as constraints and significantly improve the RMSD to 6.6 Å . The evolution-derived, residue constraints bias